In Part I of this two-part paper [1], we proposed a new game, called the Chinese restaurant game, to 
analyze the social learning problem with negative network externality. The best responses of agents in 
the Chinese restaurant game with imperfect signals are constructed through a recursive method, and the 
influence of both learning and network externality on the utilities of agents is studied. In Part II of 
this two-part paper, we illustrate three applications of the Chinese restaurant game in wireless networking, 
cloud computing, and online social networking. For each application, we formulate the corresponding 
problem as a Chinese restaurant game and analyze how agents learn and make strategic decisions in the 
problem. The proposed method is compared with four common-sense methods in terms of agents' utilities 
and the overall system performance through simulations. We find that the proposed Chinese restaurant 
game theoretic approach indeed helps agents make better decisions and improves the overall system 
performance. Furthermore, agents with different decision orders have different advantages in terms of 
their utilities, which also verifies the conclusions drawn in Part I of this two-part paper. 



December 16, 2011 



DRAFT 




I. Introduction 

Network externality, which describes the mutual influence among agents, plays an important role 
in numerous network-related applications. When the network externality is negative, i.e., the more agents 
make the same decision, the lower their utilities, agents tend to avoid making the same 
decision as others in order to maximize their utilities. This phenomenon has been observed in many 
applications in various research areas, such as dynamic spectrum access in cognitive radio networking, 
service selection in cloud computing, and deal selection on Groupon. Moreover, agents may not know 
the exact state of the network, such as the primary user's activity in the spectrum, the infrastructure of 
the service platform, or the quality of the products offered in deals. Therefore, they tend to learn such unknown 
information from signals obtained through their own measurements or from the actions of others in the network, which is 
called social learning. The main objective of this paper is to study how these two effects, negative network 
externality and social learning, affect the decisions of agents in different network-related applications. The Chinese 
restaurant game, proposed in Part I of this two-part paper [1], provides a general framework for modeling 
strategic learning and decision processes in the social learning problem with negative network externality. 

Here, we briefly describe the proposed Chinese restaurant game. We consider a Chinese restaurant with 
$K$ tables numbered $1, 2, \ldots, K$ and $N$ customers labeled $1, 2, \ldots, N$. Each table has infinitely many seats, but 
tables may differ in size. We model the table sizes of a restaurant with two components: the restaurant 
state $\theta$ and the table size functions $\{R_1(\theta), R_2(\theta), \ldots, R_K(\theta)\}$. The restaurant state $\theta$ is unknown to the 
customers. However, each customer $i$ receives a signal $s_i$ following a commonly known distribution 
$f(s \mid \theta)$. Therefore, customers may learn the state from the signals they collect. In the Chinese restaurant 
game, customers sequentially request tables in the order of their labels. 
After customer $i$ makes his request, he reveals both his decision and the signal he received. Let $x_i$ be the 
table requested by customer $i$. Then, customer $i$'s payoff is determined by a utility function $u_i(R_{x_i}, n_{x_i})$, 
where $n_{x_i}$ is the number of customers choosing table $x_i$. 

The best responses of rational customers that maximize their expected utilities can be recursively 
derived in the Chinese restaurant game. We denote $\mathbf{n} = \{n_1, n_2, \ldots, n_K\}$ as the numbers of customers 
at the $K$ tables, i.e., the grouping of customers in the restaurant at the end of the game. Let 
$\mathbf{n}_i = \{n_{i,1}, n_{i,2}, \ldots, n_{i,K}\}$ be the grouping observed by customer $i$ when making his decision, and 
$\mathbf{h}_i = \{s_1, s_2, \ldots, s_{i-1}\}$ be the history of revealed signals before customer $i$. The best response of customer 
$i$ can be written as 

\begin{equation}
x_i = BE_i(\mathbf{n}_i, \mathbf{h}_i, s_i) = \arg\max_{j} E[u_i(R_j(\theta), n_j) \mid \mathbf{n}_i, \mathbf{h}_i, s_i]. \tag{1}
\end{equation}

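Customers in the Chinese restaurant game need to estimate the current state $\theta$ from the signals they 
collect in order to make the right decisions. Let the prior probability of the state be 
$\mathbf{g}_0 = \{g_{0,l} \mid g_{0,l} = \Pr(\theta = l),\ \forall l \in \Theta\}$. We denote customer $i$'s estimate of the current state as the belief 
$\mathbf{g}_i = \{g_{i,l} \mid g_{i,l} = \Pr(\theta = l \mid \mathbf{h}_i, s_i, \mathbf{g}_0, f),\ \forall l \in \Theta\}$ for each customer $i$. Then, customer $i$ can construct his belief 
according to the Bayesian rule as follows: 

\begin{equation}
g_{i,l} = \frac{\Pr(\mathbf{h}_i, s_i \mid \theta = l) \Pr(\theta = l)}{\sum_{w \in \Theta} \Pr(\mathbf{h}_i, s_i \mid \theta = w) \Pr(\theta = w)} = \frac{g_{i-1,l}\, f(s_i \mid \theta = l)}{\sum_{w \in \Theta} g_{i-1,w}\, f(s_i \mid \theta = w)}. \tag{2}
\end{equation}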
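
As a minimal illustration of the update in (2), the following sketch assumes a finite state space and a discrete signal alphabet. The function name `update_belief` and the two-state likelihood table are hypothetical and serve only as an example; they are not taken from Part I.

```python
def update_belief(prev_belief, likelihood, signal):
    """One step of the Bayesian rule (2).

    prev_belief: dict state -> g_{i-1,l}
    likelihood:  dict state -> dict signal -> f(s | theta = l)
    signal:      the signal s_i revealed by customer i
    """
    unnormalized = {l: prev_belief[l] * likelihood[l][signal] for l in prev_belief}
    total = sum(unnormalized.values())
    return {l: v / total for l, v in unnormalized.items()}


# Hypothetical two-state example: the signal matches the state with probability 0.8.
prior = {1: 0.5, 2: 0.5}
f = {1: {1: 0.8, 2: 0.2}, 2: {1: 0.2, 2: 0.8}}

g1 = update_belief(prior, f, 1)  # belief after s_1 = 1 is revealed
g2 = update_belief(g1, f, 1)     # belief after s_2 = 1 is revealed
```

Because each revealed signal enters only through the previous belief, a customer never needs to store the whole history: the current belief is a sufficient summary of all signals revealed so far.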

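Besides the estimate of the state $\theta$, customer $i$ also needs to predict the decisions of subsequent 
customers due to the negative network externality effect. Let $m_{i,j}$ be the number of customers choosing 
table $j$ after customer $i$, including customer $i$ himself. As shown in [1], we have 

\begin{equation}
\begin{aligned}
&\Pr(m_{i,j} = X \mid \mathbf{n}_i, \mathbf{h}_i, s_i, x_i, \theta = l) \\
&\quad = \begin{cases}
\Pr(m_{i+1,j} = X - 1 \mid \mathbf{n}_i, \mathbf{h}_i, s_i, x_i, \theta = l), & x_i = j, \\
\Pr(m_{i+1,j} = X \mid \mathbf{n}_i, \mathbf{h}_i, s_i, x_i, \theta = l), & x_i \neq j,
\end{cases} \\
&\quad = \begin{cases}
\sum_{u=1}^{K} \int_{s \in S_{i+1,u}(\mathbf{n}_{i+1}, \mathbf{h}_{i+1})} \Pr(m_{i+1,j} = X - 1 \mid \mathbf{n}_{i+1}, \mathbf{h}_{i+1}, s_{i+1} = s, x_{i+1} = u, \theta = l)\, f(s \mid \theta = l)\, ds, & x_i = j, \\
\sum_{u=1}^{K} \int_{s \in S_{i+1,u}(\mathbf{n}_{i+1}, \mathbf{h}_{i+1})} \Pr(m_{i+1,j} = X \mid \mathbf{n}_{i+1}, \mathbf{h}_{i+1}, s_{i+1} = s, x_{i+1} = u, \theta = l)\, f(s \mid \theta = l)\, ds, & x_i \neq j,
\end{cases}
\end{aligned}
\tag{3}
\end{equation}

where $S_{i+1,u}(\mathbf{n}_{i+1}, \mathbf{h}_{i+1})$ is the set of signals for which customer $i+1$ chooses table $u$, and $\mathbf{h}_{i+1}$ and $\mathbf{n}_{i+1}$ can be obtained using 

\begin{equation}
\mathbf{h}_{i+1} = \{\mathbf{h}_i, s_i\} \quad \text{and} \quad \mathbf{n}_{i+1} = \{n_{i+1,1}, \ldots, n_{i+1,K}\}, \tag{4}
\end{equation}

with 

\begin{equation}
n_{i+1,k} = \begin{cases}
n_{i,k} + 1, & \text{if } x_i = k, \\
n_{i,k}, & \text{otherwise}.
\end{cases} \tag{5}
\end{equation}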



Based on (3), $\Pr(m_{i,j} = X \mid \mathbf{n}_i, \mathbf{h}_i, s_i, x_i, \theta = l)$ can be recursively calculated. Then, we can compute 
the expected utility $E[u_i(R_j(\theta), n_j) \mid \mathbf{n}_i, \mathbf{h}_i, s_i]$ and derive the best response function of customer $i$ as 

\begin{equation}
BE_i(\mathbf{n}_i, \mathbf{h}_i, s_i) = \arg\max_{j} \sum_{l \in \Theta} \sum_{x=0}^{N-i+1} g_{i,l} \Pr(m_{i,j} = x \mid \mathbf{n}_i, \mathbf{h}_i, s_i, x_i = j, \theta = l)\, u_i(R_j(l), n_{i,j} + x). \tag{6}
\end{equation}

With the recursive form, the best response functions of all customers can be obtained using backward 
induction. In Part I of this two-part paper [1], we discussed in general how customers make decisions in 
the Chinese restaurant game under different signal qualities and table size ratios. In Part II of this two-part 
paper, we illustrate how the Chinese restaurant game can be used in specific applications in 
different research areas. In Section II, we first describe four strategies against which we will compare our method. 
Then, we show the improvement from the proposed best response strategy on both the individual 
utility and the overall system efficiency in three applications: dynamic spectrum access in cognitive radio 
networking, storage service selection in cloud computing, and deal selection on Groupon in online social 
networking, in Sections III, IV, and V, respectively. For each application, we first formulate the problem 
using the Chinese restaurant game. Then, we compare the proposed best response strategy with other 
strategies in terms of agents' utilities and the overall system performance through simulations. Finally, 
we draw our conclusions in Section VI. 



II. Strategies for Comparisons 

We will compare our best response strategy with the following four strategies: random, signal, learning, 
and myopic strategies. In the random strategy, customers choose their tables uniformly at random, 
i.e., all $K$ tables have equal probability $1/K$ of being chosen under the random strategy. In the signal 
strategy, customers make their decisions purely based on their own signals, regardless of all information 
from other customers, including the revealed signals and their choices of tables. The objective of the signal 
strategy is to choose the table with the largest expected size conditioned on the customer's own signal, given by 

(7) 

The learning strategy is an extension of the signal strategy. Under this strategy, the customer learns 
the system state not only from his own signal but also from the signals revealed by the previous customers. 
Therefore, the learning strategy can be obtained as 

(8) 

where $g_{i,l} = \Pr(\theta = l \mid s_1, s_2, \ldots, s_i, \mathbf{g}_0)$ is the belief of the customer on the state. 
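
The signal and learning strategies can be sketched in code directly from their verbal descriptions. The sketch below is one plausible reading, not necessarily the exact form of (7) and (8): it assumes a finite state space, discrete signals, and tables indexed from 0, and all function names (`posterior`, `largest_expected_table`, `signal_strategy`, `learning_strategy`) and numerical values are hypothetical.

```python
def posterior(prior, likelihood, signals):
    """Belief on the state after observing `signals` (Bayes rule, one signal at a time)."""
    belief = dict(prior)
    for s in signals:
        belief = {l: belief[l] * likelihood[l][s] for l in belief}
        total = sum(belief.values())
        belief = {l: v / total for l, v in belief.items()}
    return belief


def largest_expected_table(belief, table_size):
    """Index j of the table maximizing sum_l belief[l] * R_j(l)."""
    num_tables = len(next(iter(table_size.values())))
    return max(range(num_tables),
               key=lambda j: sum(belief[l] * table_size[l][j] for l in belief))


def signal_strategy(prior, likelihood, table_size, own_signal):
    """Uses only the customer's own signal."""
    return largest_expected_table(posterior(prior, likelihood, [own_signal]), table_size)


def learning_strategy(prior, likelihood, table_size, history, own_signal):
    """Uses the revealed signals of previous customers as well as the own signal."""
    signals = list(history) + [own_signal]
    return largest_expected_table(posterior(prior, likelihood, signals), table_size)


# Hypothetical example: two states, two tables, signals correct with probability 0.6.
prior = {1: 0.5, 2: 0.5}
likelihood = {1: {1: 0.6, 2: 0.4}, 2: {1: 0.4, 2: 0.6}}
table_size = {1: [100, 40], 2: [40, 100]}  # table_size[l][j] = R_j(l)

by_signal = signal_strategy(prior, likelihood, table_size, own_signal=2)
by_learning = learning_strategy(prior, likelihood, table_size, [1, 1, 1], own_signal=2)
```

In this example the customer's own signal points to state 2, while three earlier signals point to state 1, so the two strategies select different tables. Neither strategy looks at how many customers are already seated.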
Finally, the myopic strategy simulates the behavior of a myopic player. The objective of a customer under 
the myopic strategy is to maximize his current utility, i.e., the customer makes the decision according to his 
own signal, all signals revealed by previous customers, and the current grouping $\mathbf{n}_i = \{n_{i,1}, n_{i,2}, \ldots, n_{i,K}\}$ 
as follows, 

(9) 

From (9), we can see that the myopic strategy is similar to the proposed best response strategy except that it lacks the 
Bayesian prediction of the subsequent customers' decisions. The performance of all four strategies 
will be evaluated in the simulations for each of the following applications. They serve as baselines for 
the system performance without fully rational behavior of customers. 
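
The myopic rule can be illustrated with a short sketch. It assumes that the customer evaluates each table as if he were the last to arrive, i.e., with occupancy $n_{i,j} + 1$; this is a reading of the verbal description rather than a transcription of (9). The function name `myopic_strategy`, the equal-sharing utility, and all numbers are hypothetical.

```python
def myopic_strategy(belief, table_size, grouping, utility):
    """Table maximizing the expected utility right after joining, ignoring later arrivals.

    belief:     dict state -> g_{i,l} (own and previously revealed signals included)
    table_size: dict state -> list of R_j(l)
    grouping:   list n_{i,j}, customers already seated at each table
    utility:    callable u(R, n)
    """
    def expected_now(j):
        return sum(belief[l] * utility(table_size[l][j], grouping[j] + 1) for l in belief)

    return max(range(len(grouping)), key=expected_now)


# Hypothetical example with the equal-sharing utility u(R, n) = R / n.
belief = {1: 0.9, 2: 0.1}
table_size = {1: [100, 40], 2: [40, 100]}


def share(size, n):
    return size / n


choice_empty = myopic_strategy(belief, table_size, [0, 0], share)
choice_crowded = myopic_strategy(belief, table_size, [3, 0], share)
```

With an empty restaurant the customer picks the table he believes is larger. Once three customers are already seated there, the smaller table yields the higher current utility, which is the effect of negative network externality that the signal and learning strategies ignore.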

III. Wireless Networking: Dynamic Spectrum Access in Cognitive Radio Network 

Traditional dynamic spectrum access methods focus on identifying available spectrum through spectrum 
sensing. Cooperative spectrum sensing is a promising scheme to enhance the accuracy and efficiency 
of detecting available spectrum [2]-[4]. In cooperative spectrum sensing, the sensing results from the 
secondary users are shared by all members within the same or neighboring networks. These secondary 
users then use the collected results to make spectrum access decisions collaboratively or individually. If 
the sensing results are independent of each other, cooperative spectrum sensing can significantly 
increase the accuracy of detecting the primary user's activity. Secondary users can learn from others' 
sensing results to improve their knowledge of the primary user's activity. After the available spectrum 
is detected, secondary users need to share the spectrum following some predetermined access policy. In 
general, the more secondary users access the same channel, the less access time is available to each of 
them, i.e., a negative network externality exists in this problem. Therefore, before making a decision on 
spectrum access, a secondary user should estimate both the primary user's activity based on the collected 
sensing results and the possible number of secondary users accessing the same spectrum. 

A. System Model 

We consider a cognitive radio system with $K$ channels, $N$ secondary users, and one primary user. We 
assume that the spectrum access behavior of secondary users is organized by an access point. Suppose 
that the primary user is always active and transmitting some data on one of the channels. In addition, 
the primary user's access time is slotted. At each time slot, each channel has equal probability $1/K$ 
of being selected by the primary user for transmission. The secondary users' activities are shown in Fig. 1. 
At the beginning of each time slot, secondary users individually perform sensing on all channels $1, \ldots, K$. 
Then, they follow a predefined order to sequentially determine which channel they are going to access 
in this time slot. Without loss of generality, we assume they follow the same order as their indices. 
When making a decision, a secondary user $i$ reports his decision and the sensing result to the access 
point. At the same time, all secondary users also receive this report by overhearing. After all secondary 
users have made their decisions, the access point announces the access policy of each channel: secondary 
users choosing the same channel equally share the slot time. However, if the channel is occupied by the 
primary user, their transmission will fail due to the interference from the primary user's transmission. 

Fig. 1. Sequential Cooperative Spectrum Sensing and Accessing. Panels: (b) Choose Channels and Broadcast Signals; (c) Transmission. 

Such a cognitive radio system can be modeled as a Chinese restaurant game. Let $H_j$ be the hypothesis 
that channel $j$ is occupied by the primary user. Then, let the sensing result of secondary user 
$i \in \{1, 2, \ldots, N\}$ on channel $j \in \{1, 2, \ldots, K\}$ be $s_{i,j}$. We use a simple binary model of the sensing result 
in this example, where $s_{i,j} = 1$ if the secondary user detected some activity on channel $j$ and $s_{i,j} = 0$ 
if no activity is detected on channel $j$. For secondary user $i$, his own sensing results are denoted as 
$\mathbf{s}_i = \{s_{i,1}, s_{i,2}, \ldots, s_{i,K}\}$. In addition, the results he collected from the reports of previous users are 
denoted as $\mathbf{h}_i = \{\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{i-1}\}$. 

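We define the belief of a secondary user $i$ on the occupation of channels as $\mathbf{g}_i = \{g_{i,1}, g_{i,2}, \ldots, g_{i,K}\}$, 
where $g_{i,j} = \Pr(H_j \mid \mathbf{h}_i, \mathbf{s}_i)$. Let the probabilities of false alarm and miss detection of the sensing technique 
on a single channel be $p_f$ and $p_m$, respectively. The probability of $\mathbf{s}_i$ conditioned on $H_j$ is given by 

\begin{equation}
\Pr(\mathbf{s}_i \mid H_j) = p_m^{1 - s_{i,j}} (1 - p_m)^{s_{i,j}} \prod_{k \in \{1, 2, \ldots, K\} \setminus \{j\}} p_f^{s_{i,k}} (1 - p_f)^{1 - s_{i,k}}. \tag{10}
\end{equation}

Thus, we have the following belief updating rule 

\begin{equation}
g_{i,j} = \frac{\Pr(\mathbf{h}_i, \mathbf{s}_i \mid H_j) \Pr(H_j)}{\sum_{k=1}^{K} \Pr(\mathbf{h}_i, \mathbf{s}_i \mid H_k) \Pr(H_k)} = \frac{g_{i-1,j} \Pr(\mathbf{s}_i \mid H_j)}{\sum_{k=1}^{K} g_{i-1,k} \Pr(\mathbf{s}_i \mid H_k)}. \tag{11}
\end{equation}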
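
As a small numerical illustration of the likelihood in (10) and the belief updating rule, the sketch below assumes channels indexed from 0 and a uniform prior over the $K$ hypotheses. The function names `sensing_likelihood` and `update_channel_belief` are hypothetical.

```python
def sensing_likelihood(s, j, p_f, p_m):
    """Pr(s_i | H_j) as in (10); s is a list of 0/1 results, channels are 0-indexed."""
    prob = 1.0
    for k, result in enumerate(s):
        if k == j:   # occupied channel: result 1 is a detection, 0 is a miss
            prob *= (1 - p_m) if result == 1 else p_m
        else:        # idle channel: result 1 is a false alarm
            prob *= p_f if result == 1 else (1 - p_f)
    return prob


def update_channel_belief(prev_belief, s, p_f, p_m):
    """Bayesian update of the belief over the hypotheses H_1, ..., H_K."""
    weights = [g * sensing_likelihood(s, j, p_f, p_m) for j, g in enumerate(prev_belief)]
    total = sum(weights)
    return [w / total for w in weights]


# Three channels, uniform prior, one report showing activity on the first channel only.
prior = [1 / 3] * 3
g1 = update_channel_belief(prior, [1, 0, 0], p_f=0.1, p_m=0.1)
```

With $p_f = p_m = 0.1$, a single clean report already moves the belief that the first channel is occupied from $1/3$ to $0.729 / 0.747 \approx 0.976$, which is why later secondary users can identify the occupied channel with high confidence.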

With this rule, the belief of secondary user $i$ is updated when a new sensing result is reported to the 
access point. The available access time of a channel $j$ within a slot is its slot time, which is denoted as 
$T$. However, if the channel is occupied by the primary user, its access time becomes 0. Thus, we define the 
access time of channel $j$ as 

\begin{equation}
R_j(H_k) = \begin{cases}
0, & j = k, \\
T, & \text{otherwise}.
\end{cases} \tag{12}
\end{equation}

Then, let $x_i$ be secondary user $i$'s choice of channel, and $n_j$ be the number of secondary users 
choosing channel $j$. We define the utility of a secondary user $i$ as 

\begin{equation}
u_i(x_i) = \frac{Q_{x_i} R_{x_i}}{n_{x_i}}, \tag{13}
\end{equation}

where $Q_{x_i}$ is the channel quality of channel $x_i$. If the channel has higher quality, the secondary users 
choosing the channel have higher data rates, and thus higher utility. Then, the best response of secondary 
user $i$ is as follows, 

\begin{equation}
BE_i(\mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i) = \arg\max_{x} \sum_{k \in \{1, 2, \ldots, K\} \setminus \{x\}} g_{i,k}\, E\left[ \frac{Q_x T}{n_x} \,\Big|\, \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, H_k \right]. \tag{14}
\end{equation}

This best response function can be solved recursively through the recursive equations in (3) and (6). 
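
To make the access time (12) and the utility (13) concrete, here is a minimal sketch with channels indexed from 0. The function names and the example grouping are hypothetical; the quality values correspond to $Q = (1, 1 - d, 1 - 2d)$ with $d = 20\%$.

```python
def access_time(j, occupied, slot_time):
    """R_j(H_k) as in (12): zero on the channel occupied by the primary user."""
    return 0.0 if j == occupied else slot_time


def utility(choice, occupied, quality, counts, slot_time):
    """u_i(x_i) as in (13): quality-weighted access time, shared equally."""
    return quality[choice] * access_time(choice, occupied, slot_time) / counts[choice]


quality = [1.0, 0.8, 0.6]  # Q_1, Q_2, Q_3
counts = [3, 2, 2]         # hypothetical final grouping of 7 secondary users
occupied = 2               # the primary user transmits on the third channel
T = 100.0                  # slot time in ms

u_best_channel = utility(0, occupied, quality, counts, T)
u_second_channel = utility(1, occupied, quality, counts, T)
u_occupied_channel = utility(2, occupied, quality, counts, T)
```

In this grouping a user on the lower-quality second channel obtains 40 ms of quality-weighted access time, more than the roughly 33.3 ms obtained on the crowded best channel, while users on the occupied channel obtain nothing.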
B. Simulation Results 

We simulate a cognitive radio network with 3 channels, 1 primary user, and 7 secondary users. When 
a channel is not occupied by the primary user, the available access time for secondary users in one time 
slot is 100 ms. Secondary users sense the primary user's activity in all three channels at the beginning 
of the time slot. We assume that the primary user occupies each of the three channels with equal probability. 
Conditioned on the primary user's occupation of the channel, the probabilities of miss detection (if 
occupied) and false alarm (if not occupied) in sensing one channel are both 0.1. The channel quality factor of 
channel 1 is $Q_1 = 1$, while those of channels 2 and 3 are $Q_2 = 1 - d$ and $Q_3 = 1 - 2d$. Here $d$ is the degraded factor, which is 
varied within [5%, 50%] in the simulations. We compare the proposed best response strategy in (14) 
with the other four schemes: random, signal, learning, and myopic. The simulation results 
are shown in Fig. 2. 

From Figs. 2(a), 2(b), and 2(c), we can see that secondary users have different utilities under different 
orders and schemes. For both the myopic and the proposed best response schemes, secondary user 3 
has a larger utility than secondary user 1 when the degraded factor is low. This is because secondary 
user 3 has the advantage of collecting more signals than secondary user 1 to identify the channel occupied 
by the primary user. Moreover, the loadings of the other two channels are still far from their expected 
equilibrium loadings since only two secondary users have made choices. 
Nevertheless, when the degraded factor is high, 
the utility of secondary user 1 is larger than that of secondary user 3. This is because when the degraded 
factor increases, the quality difference among channels increases. In such a case, even if secondary user 3 
can successfully identify the occupied channel, the channel that offers a higher utility in equilibrium is 
usually the one with fewer secondary users. The expected number of secondary users accessing 
such a channel is generally 2 or even 1 depending on the degraded factor, and secondary user 3 can no 
longer freely choose those channels. Secondary user 7, who has the best knowledge of the primary 
user's activity, usually has no choice since six secondary users make decisions before him. 
Therefore, he has the smallest utility. 

Generally, the myopic scheme provides an equal or lower utility than the best response scheme for 
secondary users making decisions early, such as secondary user 1, since secondary users in the myopic 
scheme do not predict the decisions of subsequent users. However, some secondary users eventually 
benefit from the mistakes made by early secondary users. We can see from Fig. 2(b) that for the cases 
$d = 20\%$ and $d = 50\%$, secondary user 3 has a higher utility under the myopic scheme than under the 
best response scheme due to the mistakes made by secondary users 1 and 2. We can also see from Fig. 2(d) 
that the best response and myopic schemes provide the same average utility over all secondary users. 
In such a case, the utility loss of some secondary users in the myopic scheme leads to a utility 
increase for some other secondary users. 

Fig. 2. Spectrum Accessing in Cognitive Radio Network under Different Schemes. Panels: (e) Secondary Users Interfere the Primary User. 

For the random and signal schemes, there is no difference among the average utilities of secondary users 
1, 3, and 7 since secondary users do not learn from other agents' actions and signals under these two 
schemes. For the learning scheme, we can see that secondary user 1 has a significantly larger utility than 
secondary users 3 and 7. This is because in the learning scheme, secondary users do not take the negative 
network externality into account when making decisions on channel selection. Since secondary users 
who make decisions later are more likely to identify the primary user's activity, they are more likely to 
choose the same channels and share them with each other, and their utilities are degraded due to the negative 
network externality. 



Let us take a deeper look at the average utility of all secondary users shown in Fig. 2(d). On the one hand, 
we can see that both the best response and myopic schemes achieve the highest average utility over all secondary 
users. The network externality effect in spectrum access forces strategic secondary users to access different 
channels instead of accessing the same high-quality channels. On the other hand, the learning and signal 
schemes lead to poor average utilities since they do not consider the network externality in their decision 
processes. All secondary users tend to access the same available high-quality channel, and therefore the 
spectrum resource in the other available channels is wasted. This also explains why the learning 
scheme leads to poorer performance than the signal scheme. Under the learning scheme, secondary users 
are more likely to reach a consensus on the primary user's activity and make the same choice of 
channel, which degrades the overall system performance. 

Finally, we show the number of secondary users causing interference to the primary user in Fig. 
2(e). We can see that the schemes involving learning, namely the best response, myopic, and learning 
schemes, significantly reduce the interference to the primary user. Secondary users who learn from 
others' signals efficiently avoid the channel occupied by the primary user. 

Fig. 3. Platform Selection in Cloud Computing 

IV. Cloud Computing: Cloud Storage Service Selection 

In the second application, we consider storage service selection in cloud computing. Reliability and 
availability are the major concerns of potential cloud subscribers in choosing a cloud storage platform. 
These two features may be enhanced by the cloud computing platform through optimizing the configuration and 
upgrading the software architecture and hardware infrastructure [5]. However, reliability and availability 
are also affected by the number of subscribers. For instance, when an overwhelming number of subscribers 
access the storage service, these subscribers may experience unacceptable waiting time, service blocking, 
or even data loss in transmission due to network congestion and capacity overloading. The reliability of the 
platform may decrease to an unacceptable level even with upgraded infrastructure [6]. In general, for 
a fixed software architecture and hardware infrastructure, the more subscribers use the same platform, 
the lower the service quality of the cloud storage platform. Thus, before choosing the cloud service, a 
potential subscriber should consider both the infrastructure of the platform and the potential growth of 
the number of subscribers to the platform. 

A. System Model 

Let us consider two cloud storage platforms with $N$ potential subscribers. These two platforms, say 
A and B, offer the same storage plan. A storage plan contains two items: maximum storage space and 
pricing. These plans require long-term contracts, which means that a subscriber cannot change from 
one platform to another in a short period. Since both platforms provide the same plan, a subscriber 
has no reason to prefer one of them based on price or storage size. Suppose that the service qualities 
of these two platforms are different. One of the platforms has upgraded its infrastructure and can 
provide higher availability. Let us consider a binary model of the infrastructure, i.e., $H_h$ and $H_l$ are 
the high-reliable infrastructure and the low-reliable infrastructure, respectively. When the high-reliable infrastructure 
$H_h$ is used in the platform, the probability that a single subscriber causes a fatal crash of the service is $p_h$. 
If the crash happens, the platform becomes unavailable to all subscribers at a certain time. When the low-reliable 
infrastructure $H_l$ is used, the probability of a fatal crash per subscriber is $p_l$, where $p_l > p_h$. We 
assume that there is no third party to tell the subscribers exactly which infrastructure the platforms 
are using. Thus, both platforms may claim that they have upgraded the infrastructure, and the subscribers 
may not be able to verify it before choosing their services. However, they all know the prior probability 
$g_0$ that platform A is the one that upgraded the infrastructure. They have also collected some rumors 
about the true state of each platform, and have formed their own beliefs on the infrastructure of each 
platform. 

We assume these subscribers make their choices sequentially. When a subscriber $i$ makes his choice, 
as shown in Fig. 3, he announces his decision and posts the rumor he received on a public discussion 
forum. Thus, all subscribers know his decision and the rumor he received. We assume the decision 
process is relatively short compared with the long-term contracts required for using the storage service. 
Thus, we ignore the transition state and focus on the service availability of the final state, i.e., the time 
after all subscribers have made their decisions. The service availability experienced by one subscriber is 
determined by two factors: the number of subscribers at the end of the decision process and 
the infrastructure used by the platform. 

Let the rumor received by subscriber $i$ be a random variable $s_i \in \{A, B\}$ conditioned on the 
infrastructure used by each platform. When $s_i = A$, the subscriber receives a rumor favoring platform 
A; otherwise the rumor favors platform B. Let $H_A$ and $H_B$ be the infrastructures used by platforms A 
and B, respectively. The probability distribution $f(s_i \mid H)$ is defined as follows: 

\begin{equation}
f(s_i = x \mid H_x) = \begin{cases}
p, & H_x = H_h, \\
1 - p, & \text{otherwise},
\end{cases} \tag{15}
\end{equation}

where $x \in \{A, B\}$ and $p$ is the conditional probability that the rumor is true. Note that since only 
one platform upgrades the infrastructure, if $H_A = H_h$, then $H_B = H_l$, and vice versa. The belief of a 
subscriber $i$ is defined as $g_i = \Pr(H_A = H_h \mid \mathbf{h}_i, s_i)$, where $\mathbf{h}_i = \{s_1, s_2, \ldots, s_{i-1}\}$ is the collection of 
the rumors posted on the public discussion forum. The belief updating rule is given as follows: 

\begin{equation}
\begin{aligned}
g_i &= \frac{\Pr(\mathbf{h}_i, s_i \mid H_A = H_h) \Pr(H_A = H_h)}{\Pr(\mathbf{h}_i, s_i \mid H_A = H_h) \Pr(H_A = H_h) + \Pr(\mathbf{h}_i, s_i \mid H_A = H_l) \Pr(H_A = H_l)} \\
&= \frac{g_{i-1} f(s_i \mid H_A = H_h)}{g_{i-1} f(s_i \mid H_A = H_h) + (1 - g_{i-1}) f(s_i \mid H_A = H_l)}.
\end{aligned}
\tag{16}
\end{equation}
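
The rumor model (15) and the belief update (16) can be traced numerically with a short sketch. The function names `rumor_likelihood` and `update_rumor_belief`, the prior $g_0 = 0.5$, and the rumor sequence are hypothetical.

```python
def rumor_likelihood(rumor, a_is_high, p):
    """f(s_i | H_A) as in (15): a rumor favors the upgraded platform with probability p."""
    favors_high = (rumor == "A") == a_is_high
    return p if favors_high else 1 - p


def update_rumor_belief(prev, rumor, p):
    """Belief update (16) for g_i = Pr(H_A = H_h | h_i, s_i)."""
    num = prev * rumor_likelihood(rumor, True, p)
    den = num + (1 - prev) * rumor_likelihood(rumor, False, p)
    return num / den


g = 0.5  # assumed prior g_0
trace = []
for rumor in ["A", "A", "B"]:
    g = update_rumor_belief(g, rumor, p=0.7)
    trace.append(g)
```

With $p = 0.7$ the belief moves to 0.7 after the first rumor favoring A, to about 0.845 after the second, and back to 0.7 after a rumor favoring B, since one rumor for each platform cancel out. The belief therefore depends only on how many rumors favor each platform, not on their order.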
The utility function of a subscriber $i$ choosing the platform $x \in \{A, B\}$ is given by 

\begin{equation}
u_i(x) = (1 - p_x)^{n_x}, \tag{17}
\end{equation}

where $p_x$ is the probability of a fatal service crash caused by one subscriber and $n_x$ is the number of 
subscribers using platform $x$. The utility function $u_i(x)$ describes the probability that the cloud storage 
service is available, given the number of subscribers and the reliability of the infrastructure. 

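Finally, let $n_{i,x}$ be the number of subscribers choosing platform $x$ before subscriber $i$, so that 
$n_{i,A} = i - 1 - n_{i,B}$. Then, the best choice of a subscriber $i$ is 

\begin{equation}
BE_i(n_{i,A}, \mathbf{h}_i, s_i) = \begin{cases}
A, & E[(1 - p_A)^{n_A} \mid n_{i,A}, \mathbf{h}_i, s_i] > E[(1 - p_B)^{n_B} \mid n_{i,A}, \mathbf{h}_i, s_i], \\
B, & \text{otherwise}.
\end{cases} \tag{18}
\end{equation}

This best response function can be solved recursively through the recursive equations in (3) and (6). 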
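
For two platforms and binary rumors, the recursion can be written out explicitly because the integral over signals reduces to a two-term sum. The following is a minimal sketch under stated assumptions, not the authors' implementation: the prior `G0 = 0.5` is assumed, the history is summarized by the number of rumors favoring A, ties are broken toward B as in (18), and all function names are hypothetical.

```python
from functools import lru_cache

N = 10        # number of subscribers
P_H = 0.0001  # crash probability per subscriber, high-reliable infrastructure
P_L = 0.0005  # crash probability per subscriber, low-reliable infrastructure
P = 0.7       # probability that a rumor is true
G0 = 0.5      # prior that platform A is the upgraded one (assumed value)


def belief(k_a, k_b):
    """Pr(H_A = H_h) after k_a rumors favoring A and k_b favoring B, as in (16)."""
    like_a = G0 * P**k_a * (1 - P)**k_b
    like_b = (1 - G0) * (1 - P)**k_a * P**k_b
    return like_a / (like_a + like_b)


def utility(choice, a_high, n_a):
    """Availability (17) of the chosen platform when n_a subscribers end up on A."""
    n = n_a if choice == "A" else N - n_a
    high = a_high if choice == "A" else not a_high
    return (1 - (P_H if high else P_L)) ** n


@lru_cache(maxsize=None)
def final_dist(i, n_a, k_a, a_high):
    """Distribution of the final n_A when subscribers i..N play best responses,
    given n_a choices of A and k_a rumors favoring A among the first i - 1."""
    if i > N:
        return ((n_a, 1.0),)
    p_rumor_a = P if a_high else 1 - P
    out = {}
    for s, ps in (("A", p_rumor_a), ("B", 1 - p_rumor_a)):
        x = best_response(i, n_a, k_a, s)
        nxt = final_dist(i + 1, n_a + (x == "A"), k_a + (s == "A"), a_high)
        for n_final, pr in nxt:
            out[n_final] = out.get(n_final, 0.0) + ps * pr
    return tuple(sorted(out.items()))


@lru_cache(maxsize=None)
def best_response(i, n_a, k_a, s):
    """Choice of subscriber i as in (18), predicting the later subscribers."""
    k_a_new = k_a + (s == "A")
    g = belief(k_a_new, i - k_a_new)  # i rumors are known, including the own one

    def expected(choice):
        total = 0.0
        for a_high, weight in ((True, g), (False, 1 - g)):
            dist = final_dist(i + 1, n_a + (choice == "A"), k_a_new, a_high)
            total += weight * sum(pr * utility(choice, a_high, n) for n, pr in dist)
        return total

    return "A" if expected("A") > expected("B") else "B"
```

For instance, `best_response(10, 9, 9, "A")` returns `"B"`: although the last subscriber is almost certain that A is the upgraded platform, being alone on the low-reliable platform ($0.9995$) is better than being the tenth subscriber on the high-reliable one ($0.9999^{10} \approx 0.9990$).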
B. Simulation Results 

We simulate a system with two storage service platforms and 10 subscribers. There are two types of 
infrastructure: high-reliable and low-reliable. The high-reliable infrastructure $H_h$ 
offers a service with a probability of failure per user of $p_h = 0.0001$. For the low-reliable infrastructure, 
the probability of failure per user is $p_l$. Each subscriber collects a rumor from the Internet, which favors 
one of these two platforms. In the first simulation, we discuss how subscribers make decisions given 
different accuracies of the collected rumors. In this simulation, the parameter $p_l$ is set to 0.0005 and $p$ 
lies in [0.55, 0.95]. The simulation results are shown in Fig. 4. 

Fig. 4. Storage Platform Selection in Cloud Computing vs. Accuracy of Rumors. Panels: (a) Subscriber 1; (b) Subscriber 10. 

From Fig. 4(a), we can see that the proposed scheme provides the largest utility for subscriber 1 
and that the utility increases as the accuracy of the rumors increases. With the myopic scheme, the utility 
of subscriber 1 is lower since the decisions of subsequent subscribers are not taken into account. 
Nevertheless, by considering the negative network externality effect in the decision process, the myopic 
scheme performs better than the signal and learning schemes. Similar results can be observed for the utility 
of subscriber 10, as shown in Fig. 4(b). The only difference is that the best response scheme provides 
a utility to subscriber 10 similar to that of the myopic scheme. The reason is that subscriber 10 is the 
last subscriber and does not need to predict any other subscriber's decision. From Fig. 4(b), we 
also observe that the utility of subscriber 10 with the best response scheme decreases when $p$ is larger 
than 0.8. This is because as the accuracy of the rumors increases, the subscribers before subscriber 10 can 
better identify the platforms' infrastructures and thus make better decisions. In such a case, subscriber 
10 has less chance to choose the better platform before it reaches the expected number of subscribers 
in equilibrium. 

Fig. 5. Storage Platform Selection in Cloud Computing vs. Reliability of Platforms. Panels: (c) Average Utility of All Subscribers; (d) Reliability under High- and Low-reliable Platforms. 

Then, we simulate with $p_l \in [0.0002, 0.001]$ and $p = 0.7$ to see how subscribers make choices given 
different reliabilities of the low-reliable platform. The simulation results are shown in Fig. 5. From Figs. 5(a) 
and 5(b), we observe that the best response scheme gives both subscriber 1 and subscriber 10 the largest 
utilities among all schemes. Moreover, subscriber 1 has a larger utility than subscriber 10 when $p_l$ is low, 
but has a smaller utility when $p_l$ is high. This is because when $p_l$ is low and close to $p_h$, the network 
externality effect dominates the reliability of the platform. Subscriber 1 has the advantage of choosing the 
platform offering the higher reliability in equilibrium before it is over-crowded. However, when $p_l$ is high, 
the difference in infrastructure dominates the reliability of the platform. Subscribers would prefer the 
platform with the high-reliable infrastructure even if other subscribers have already chosen the same platform. In 
such a case, subscriber 10 has the advantage of identifying the platform with the high-reliable infrastructure 
from the collected rumors. 
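
The two regimes can be checked with a short calculation based on the utility (17). Setting $(1 - p_h)^{n_h} = (1 - p_l)^{n_l}$ gives the ratio $n_h / n_l = \ln(1 - p_l) / \ln(1 - p_h)$ of subscribers at which both platforms are equally available. The sketch below evaluates this ratio at the two ends of the simulated range; the function names are hypothetical.

```python
import math


def availability(p_crash, n):
    """Utility (17): probability that a platform with n subscribers stays available."""
    return (1 - p_crash) ** n


def break_even_ratio(p_h, p_l):
    """Ratio n_h / n_l at which both platforms offer the same availability."""
    return math.log(1 - p_l) / math.log(1 - p_h)


P_H = 0.0001
ratio_low = break_even_ratio(P_H, 0.0002)   # p_l close to p_h
ratio_high = break_even_ratio(P_H, 0.001)   # p_l ten times p_h
```

For $p_l = 0.0002$ the ratio is about 2, so the high-reliable platform is preferable only while it serves fewer than roughly twice as many subscribers as the other one, and the grouping matters most. For $p_l = 0.001$ the ratio is slightly above 10, so with 10 subscribers the high-reliable platform remains marginally more available even if all of them join it, and identifying the infrastructure matters most.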

The myopic scheme offers a lower utility for subscriber 1 since it lacks the prediction of the decisions 
of subsequent subscribers. For subscriber 10, however, there is almost no difference between these two 
schemes. This is because subscriber 10, as the last subscriber, does not have to predict the decisions 
of other subscribers. For the signal and learning schemes, subscribers have lower utilities since they do 
not take the network externality into account. In most cases, the signal scheme provides a larger utility 
to subscribers than the learning scheme since subscribers achieve some kind of load balance between 
the two platforms according to the signal distribution $f(s \mid \theta)$. Note that when $p_l$ is high, the utility of 
subscriber 10 under the learning scheme is larger than that under the signal scheme. This is because when 
$p_l$ is high, the platform with the high-reliable infrastructure provides higher reliability even if it serves more 
subscribers. In such a case, a subscriber should try to identify and choose the high-reliable platform, which 
is exactly what the learning strategy does. The average utility of all subscribers under different schemes 
is shown in Fig. 5(c). We can see that both the best response and myopic schemes provide the highest 
average utility and that the signal scheme performs better than the learning scheme due to the load balancing 
that results from the signal distribution $f(s \mid \theta)$. 

In Fig. 5(d), we evaluate the average reliability of each platform under different schemes. Note that 
in Fig. 5(d), if a platform is not chosen by any subscriber in the simulation, we assume the reliability 
of that platform is 1. For the platform with the high-reliable infrastructure, we can see that it has the 
highest reliability under the signal scheme and the lowest reliability under the learning scheme. On the 
contrary, the platform with the low-reliable infrastructure has the highest reliability under the learning scheme and the 
lowest reliability under the signal scheme. We can see that both the best response and myopic schemes 
provide a better load balance between these two platforms. The reliability of the low-reliable platform 
is significantly improved without much loss on the high-reliable platform. 

V. Online Social Networking: Deal Selection on Groupon 

Finally, we consider the effect of negative network externality in online social networking, taking 
Groupon as an example. Groupon, a recently launched social network, shows a new possibility for e-commerce business 
models [7]. As shown in Fig. 6, it offers small businesses, especially local restaurants, a platform to promote 
their products with significantly discounted deals. These deals are mostly effective only for a limited time or 
even in a limited quantity for advertising purposes. Customers who purchase deals on Groupon may also 
promote the deals on other social networks such as Facebook or Twitter through the built-in tools provided 
by Groupon. However, it has been observed that when a product or service has been successfully 
promoted through deals, it is likely to receive negative responses and a low reputation due to degraded 
quality of service or over-expectation of the products [8]. For example, a local restaurant may provide 
a 50%-off deal for advertising purposes. However, this deal may be sold in the thousands, which means that 
the restaurant needs to serve a huge number of customers for months with little profit. In such a case, the 
quality of the meals and service will be degraded; otherwise the restaurant may not be able to survive [9]. 
Thus, a negative network externality exists in this problem. The customer should take into account the 
possibility of degraded service quality when choosing deals on Groupon. 

Fig. 6. Deals on Groupon 

A. System Model 

Let us consider $K$ deal sites, each of them offering a deal from a restaurant. These restaurants are of 
the same kind and in the same area, which means that they share the same group of potential customers. 
We assume there are $N$ potential customers for these restaurants. The deals these restaurants provide 
have similar time limitations, so in general one customer will purchase only one of them. The price 
of the deal provided by restaurant $j$, $c_j$, is known by all customers. On the contrary, the quality of 
meals in restaurant $j$, $Q_j$, is unknown since these customers have not visited the restaurant. However, 
they may collect some reviews on the Internet to estimate $Q_j$. We model $Q_j$ with a binary model, i.e., 
$Q_j \in \{Q_h, Q_l\}$, where $Q_h$ and $Q_l$ denote high and low quality, respectively. We also assume that the 
qualities of the restaurants are uncorrelated. The customers purchase deals sequentially. After one customer 
purchases a deal, his purchase along with the reviews he found will be posted on some public social networks 
that can be seen by all customers. Finally, after all purchases have been made, each customer visits the 
restaurant according to the deal he purchased for a meal. 

The review customer $i$ found on restaurant $j$, denoted as $s_{i,j}$, may be positive ($s_p$) or negative ($s_n$). 
Conditioning on the quality of the restaurant $Q_j$, the probability distribution of the review is described 
as follows: 

$$f(s_{i,j}|Q_j) = \begin{cases} p, & s_{i,j} = s_p,\ Q_j = Q_h \ \text{or} \ s_{i,j} = s_n,\ Q_j = Q_l, \\ 1 - p, & \text{otherwise}, \end{cases} \tag{19}$$

where $p$ represents the quality of reviews. We denote $\mathbf{s}_i = \{s_{i,1}, s_{i,2}, \ldots, s_{i,K}\}$ as the reviews customer $i$ 
collected for all restaurants and denote $\mathbf{h}_i = \{\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{i-1}\}$ as the reviews shared by previous customers 
$1 \sim i-1$. Then, the belief of customer $i$ on restaurant $j$'s quality is given as follows: 

$$g_{i,j} = \frac{\Pr(\mathbf{h}_i, \mathbf{s}_i | Q_j = Q_h)\Pr(Q_j = Q_h)}{\Pr(\mathbf{h}_i, \mathbf{s}_i | Q_j = Q_h)\Pr(Q_j = Q_h) + \Pr(\mathbf{h}_i, \mathbf{s}_i | Q_j = Q_l)\Pr(Q_j = Q_l)} = \frac{g_{i-1,j} f(s_{i,j}|Q_j = Q_h)}{g_{i-1,j} f(s_{i,j}|Q_j = Q_h) + (1 - g_{i-1,j}) f(s_{i,j}|Q_j = Q_l)}. \tag{20}$$

Note that since the qualities of the restaurants are uncorrelated, the belief on one restaurant is determined 
only by the reviews related to it. 
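The belief update above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names and the boolean encoding of reviews and qualities are assumptions, and the values (a prior of 0.5 and p = 0.7) are taken from the simulation settings of this section.

```python
def review_likelihood(review_positive: bool, quality_high: bool, p: float) -> float:
    """f(s|Q) as in (19): a review matches the true quality with probability p."""
    return p if review_positive == quality_high else 1.0 - p


def update_belief(prior: float, review_positive: bool, p: float) -> float:
    """Posterior probability that the restaurant is high quality after one more review."""
    like_high = review_likelihood(review_positive, True, p)
    like_low = review_likelihood(review_positive, False, p)
    return prior * like_high / (prior * like_high + (1.0 - prior) * like_low)


# Fold three revealed reviews of one restaurant into the belief, one at a time.
belief = 0.5
for review in [True, True, False]:
    belief = update_belief(belief, review, p=0.7)
    print(round(belief, 4))
```

Because the restaurants are assumed uncorrelated, the same update can be run independently for each restaurant.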

Finally, let $n_j$ be the number of customers purchasing restaurant $j$'s deal and $x_i$ be the choice of customer 
$i$. Then the utility function of customer $i$ can be defined as 

$$u_i(Q_{x_i}, n_{x_i}, c_{x_i}) = R(Q_{x_i}) - d\,n_{x_i} - c_{x_i}, \tag{21}$$

where $R(Q)$ is the value function of a customer on the quality of the meal and $d$ is the crowd discount 
factor. Here, we assume the degradation of restaurant quality is linear in the number of customers. 
According to the utility function, we have 

$$E[u_i(Q_j, n_j, c_j) | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i = j] = E[R(Q_j) | \mathbf{h}_i, \mathbf{s}_i] - d\,E[n_j | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i = j] - c_j. \tag{22}$$
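To make the trade-off in (21) concrete, the following sketch evaluates the realized utility for a few ways of splitting 9 customers between a high-quality and a low-quality restaurant. It assumes the identity value function R(Q) = Q and an illustrative low quality of 20; neither is fixed by the text, while the price of 5 and d = 2 follow the simulation settings.

```python
def utility(quality: float, n_customers: int, price: float, d: float = 2.0) -> float:
    """Realized utility in (21), assuming the value function R(Q) = Q."""
    return quality - d * n_customers - price


# Compare a crowded high-quality restaurant with a quieter low-quality one.
for n_high in (6, 7, 8):
    n_low = 9 - n_high
    u_high = utility(quality=30.0, n_customers=n_high, price=5.0)
    u_low = utility(quality=20.0, n_customers=n_low, price=5.0)
    print(n_high, n_low, u_high, u_low)
```

Under these assumed numbers the high-quality deal stops being the better choice once it attracts enough customers, which is the negative network externality the model is built around.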
B. Simplification with Linear Utility Function 

With the linear utility function in (22), we can significantly simplify the original recursive solution. 
Due to the linearity of the utility function, the expected utility is determined by the expected value of 
$R(Q_j)$ and the expected number of customers choosing $j$. The expected value of $R(Q_j)$ can be derived 
directly through the current belief of customer $i$ as 

$$E[R(Q_j) | \mathbf{h}_i, \mathbf{s}_i] = g_{i,j} R(Q_h) + (1 - g_{i,j}) R(Q_l). \tag{23}$$

For the expected number of customers choosing $j$, we first define $m_{i,j}$ as the number of customers 
choosing $j$ after customer $i$ (including customer $i$ himself). Then, we have 

$$E[n_j | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i] = n_{i,j} + E[m_{i,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i]. \tag{24}$$

According to (22), (23), and (24), the best response function of customer $i$ is given by 

$$BE_i(\mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i) = \arg\max_{j} \left\{ g_{i,j} R(Q_h) + (1 - g_{i,j}) R(Q_l) - d\,n_{i,j} - c_j - d\,E[m_{i,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i = j] \right\}. \tag{25}$$

In (25), the only unknown term is $E[m_{i,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i = j]$, which can be calculated recursively. 
Suppose that we have the best response function of customer $i+1$. The best response of customer $i+1$ 
is $x_{i+1} = j$ if and only if $\mathbf{s}_{i+1} \in \mathcal{S}_{i+1,j}(\mathbf{n}_{i+1}, \mathbf{h}_{i+1})$. 
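As an illustration of how (25) is evaluated once the expected number of later customers is available, the sketch below takes that term as an input instead of computing it. The function name, the identity value function R(Q) = Q, the low quality of 20, and the numerical inputs are all assumptions made for the example.

```python
def best_response(beliefs, counts, prices, expected_later,
                  q_high=30.0, q_low=20.0, d=2.0):
    """Index j maximizing the bracket in (25), assuming R(Q) = Q.

    expected_later[j] stands for E[m_{i,j} | ..., x_i = j]; here it is supplied
    by the caller, whereas the paper obtains it recursively.
    """
    def value(j):
        expected_quality = beliefs[j] * q_high + (1.0 - beliefs[j]) * q_low  # (23)
        return expected_quality - d * counts[j] - prices[j] - d * expected_later[j]

    # max() returns the first maximizer, so ties go to the lowest index.
    return max(range(len(beliefs)), key=value)


beliefs, counts, prices = (0.8, 0.4), (2, 1), (5.0, 5.0)
# Counting only the customer himself (no later arrivals) favours restaurant 0.
print(best_response(beliefs, counts, prices, expected_later=(1.0, 1.0)))
# Anticipating two further arrivals at restaurant 0 flips the choice to restaurant 1.
print(best_response(beliefs, counts, prices, expected_later=(3.0, 1.0)))
```

The two calls differ only in the anticipated crowd, which shows why ignoring subsequent customers can lead to a different decision.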

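Conditioning on the current belief $\mathbf{g}_i$, the probability distribution of reviews $f(\mathbf{s}|\mathbf{g}_i)$ is given as 

$$f(\mathbf{s}|\mathbf{g}_i) = \prod_{j=1}^{K} \left[ g_{i,j} f(s_j | Q_j = Q_h) + (1 - g_{i,j}) f(s_j | Q_j = Q_l) \right]. \tag{26}$$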
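The predictive distribution of the next customer's reviews can be enumerated directly when reviews are binary. The sketch below assumes the product form implied by uncorrelated restaurant qualities and encodes a positive review as 1 and a negative one as 0; the function name and the example beliefs are illustrative.

```python
from itertools import product


def review_distribution(beliefs, p):
    """Map each joint review vector (1 = positive, 0 = negative) to its probability."""
    dist = {}
    for reviews in product((1, 0), repeat=len(beliefs)):
        prob = 1.0
        for s, g in zip(reviews, beliefs):
            like_high = p if s == 1 else 1.0 - p   # f(s | Q_h) from (19)
            like_low = 1.0 - like_high             # f(s | Q_l) from (19)
            prob *= g * like_high + (1.0 - g) * like_low
        dist[reviews] = prob
    return dist


dist = review_distribution(beliefs=(0.7, 0.5), p=0.7)
for reviews, prob in dist.items():
    print(reviews, round(prob, 4))
```

The probabilities sum to one over all review vectors, which is a useful sanity check before plugging the distribution into the recursion.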

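Then, the recursive form to compute $E[m_{i,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i]$ is given as follows: 

$$E[m_{i,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i] = \begin{cases} 1 + E[m_{i+1,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i], & x_i = j, \\ E[m_{i+1,j} | \mathbf{n}_i, \mathbf{h}_i, \mathbf{s}_i, x_i], & x_i \neq j, \end{cases}$$

$$= \begin{cases} 1 + \sum_{u=1}^{K} \int_{\mathbf{s} \in \mathcal{S}_{i+1,u}(\mathbf{n}_{i+1}, \mathbf{h}_{i+1})} E[m_{i+1,j} | \mathbf{n}_{i+1}, \mathbf{h}_{i+1}, \mathbf{s}_{i+1} = \mathbf{s}, x_{i+1} = u] f(\mathbf{s}|\mathbf{g}_i)\,d\mathbf{s}, & x_i = j, \\ \sum_{u=1}^{K} \int_{\mathbf{s} \in \mathcal{S}_{i+1,u}(\mathbf{n}_{i+1}, \mathbf{h}_{i+1})} E[m_{i+1,j} | \mathbf{n}_{i+1}, \mathbf{h}_{i+1}, \mathbf{s}_{i+1} = \mathbf{s}, x_{i+1} = u] f(\mathbf{s}|\mathbf{g}_i)\,d\mathbf{s}, & x_i \neq j, \end{cases} \tag{27}$$

with $\mathbf{h}_{i+1}$ and $\mathbf{n}_{i+1}$ being the updated review history and grouping state defined earlier. 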
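For binary reviews the recursion can be carried out exactly by enumeration. The following sketch is one possible implementation under stated assumptions, not the authors' code: it assumes R(Q) = Q, a uniform prior on each restaurant's quality, 0 < p < 1, ties broken in favour of the lowest index, and it summarizes the review history of each restaurant by the difference between its positive and negative reviews. All function and parameter names are hypothetical.

```python
from functools import lru_cache
from itertools import product


def build_solver(n_customers, n_restaurants, p, q_high, q_low, d, prices):
    """Return (best_response, expected_later) for the deal-selection game."""

    def belief(diff):
        # Posterior Pr(Q_j = Q_h) given diff = (#positive - #negative) reviews.
        odds = (p / (1.0 - p)) ** diff
        return odds / (1.0 + odds)

    def review_prob(reviews, diffs):
        # Probability of the next review vector given the current beliefs.
        prob = 1.0
        for s, diff in zip(reviews, diffs):
            g = belief(diff)
            like_high = p if s == 1 else 1.0 - p
            prob *= g * like_high + (1.0 - g) * (1.0 - like_high)
        return prob

    def later_given_choice(i, counts, diffs, choice):
        # E[m_{i,j}] for every j once customer i has picked `choice`.
        new_counts = tuple(n + (j == choice) for j, n in enumerate(counts))
        nxt = expected_later(i + 1, new_counts, diffs)
        return tuple(m + (j == choice) for j, m in enumerate(nxt))

    def best_response(i, counts, diffs):
        # Evaluates (25); diffs already include customer i's own reviews.
        def value(j):
            g = belief(diffs[j])
            m_j = later_given_choice(i, counts, diffs, j)[j]
            return g * q_high + (1.0 - g) * q_low - d * counts[j] - prices[j] - d * m_j
        return max(range(n_restaurants), key=value)

    @lru_cache(maxsize=None)
    def expected_later(i, counts, diffs):
        # E[m_{i,j}] for every j before customer i's reviews are drawn.
        if i > n_customers:
            return (0.0,) * n_restaurants
        total = [0.0] * n_restaurants
        for reviews in product((1, -1), repeat=n_restaurants):
            new_diffs = tuple(x + s for x, s in zip(diffs, reviews))
            choice = best_response(i, counts, new_diffs)
            later = later_given_choice(i, counts, new_diffs, choice)
            weight = review_prob(reviews, diffs)
            for j in range(n_restaurants):
                total[j] += weight * later[j]
        return tuple(total)

    return best_response, expected_later


best_response, expected_later = build_solver(9, 2, 0.7, 30.0, 20.0, 2.0, (5.0, 5.0))
# Expected number of customers at each restaurant before anyone has chosen.
print(expected_later(1, (0, 0), (0, 0)))
```

Memoizing on the customer index, the current counts, and the review differences keeps the state space small (a few thousand states for 9 customers and 2 restaurants), so the backward recursion finishes quickly.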
C. Simulation Results 

We simulate a deal-offering website with two deals from two different restaurants and 9 customers. 
These two deals offer similar meals and have the same price of 5. We assume that there are two types 
of restaurants: the high-quality restaurant with $Q_h = 30$ and the low-quality restaurant with 
$Q_l \in [10, 28]$. Let the crowd discount factor $d$ in the utility function be 2. The probabilities that a 
restaurant's quality is high or low are equal. Moreover, a restaurant's quality is independent of the 
other restaurant's quality. Conditioning on the restaurant's quality, a customer receives a positive review 
on the restaurant with a probability of 0.7 if the restaurant's quality is high. Similarly, if the restaurant's 
quality is low, the probability that a customer receives a negative review on the restaurant is also 0.7. 
Each customer receives a review at the beginning of the simulation. Then, they choose the deals and 
reveal their collected reviews to other customers sequentially. We compare the best response strategy in 
(25) with four other strategies: the random, signal, learning, and myopic strategies. The simulation results are 
shown in Fig. 7. 
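A Monte Carlo sketch of three of the baseline schemes is given below for orientation. The scheme definitions used here are this sketch's own reading, since the schemes are specified in an earlier section: "random" picks uniformly, "signal" follows only the customer's own reviews, and "learning" follows all revealed reviews plus the customer's own, with ties broken at random. It also assumes R(Q) = Q and an illustrative low quality of 20; the myopic and best response schemes are not included.

```python
import random


def simulate(scheme, n_customers=9, p=0.7, q_high=30.0, q_low=20.0, d=2.0,
             price=5.0, runs=2000, seed=0):
    """Average customer utility for two restaurants under a simple scheme."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        quality_high = [rng.random() < 0.5 for _ in range(2)]
        diffs = [0, 0]    # revealed (#positive - #negative) reviews per restaurant
        counts = [0, 0]   # customers who have chosen each restaurant so far
        for _ in range(n_customers):
            # A review matches the true quality with probability p, as in (19).
            reviews = [1 if (rng.random() < p) == high else -1
                       for high in quality_high]
            if scheme == "random":
                scores = [0, 0]
            elif scheme == "signal":
                scores = reviews
            elif scheme == "learning":
                # A larger review difference means a larger belief in high quality.
                scores = [x + s for x, s in zip(diffs, reviews)]
            else:
                raise ValueError(f"unknown scheme: {scheme}")
            best = max(scores)
            choice = rng.choice([j for j in range(2) if scores[j] == best])
            counts[choice] += 1
            diffs = [x + s for x, s in zip(diffs, reviews)]
        for j in range(2):
            quality = q_high if quality_high[j] else q_low
            # Every customer at restaurant j receives the same realized utility (21).
            total += counts[j] * (quality - d * counts[j] - price)
    return total / (runs * n_customers)


for scheme in ("random", "signal", "learning"):
    print(scheme, round(simulate(scheme), 3))
```

Sweeping `q_low` over the range used in the simulation gives curves that can be compared qualitatively with the discussion that follows.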

As shown in Fig. 7(a) and 7(b), we can see that customers with different orders have different average 
utilities except under the random scheme. For the best response scheme, when $Q_l$ is low, customer 1 has a 
smaller average utility than customer 9. This is because the quality difference between high- and 
low-quality restaurants is more significant when $Q_l$ is low, and the major factor that determines the utility 
in this case is the quality of the restaurant. Since customer 9 has the advantage of collecting more reviews 
and knowing the quality of the restaurants with a stronger belief, he can choose more wisely among the deals 
and thus obtain a larger utility. On the contrary, when $Q_l$ is high, the difference in restaurant quality is 
small, and the network externality becomes the major factor in determining the utility of the customers. In 
such a case, early customers, such as customer 1, have the advantage of choosing the restaurant that has a 
lower number of customers at the equilibrium. 

The myopic scheme provides a smaller utility for customer 1 than the best response scheme since the 
expected decisions of subsequent customers are not taken into account under this scheme. For customer 
9, who is the last customer in the simulation, the utilities under the myopic and best response schemes 
are indistinguishable. We can also see that customers have the lowest utility under the 
learning scheme. This is because, without considering the network externality effect, all customers are 
likely to choose the same deal under the learning scheme, which significantly reduces the quality of the 
meals and thus leads to smaller utilities. This in some sense also reflects the phenomenon we previously 
mentioned: the combination of traditional social learning and deal promotion on Groupon may eventually reduce 
the quality of the products if the customers make decisions from learning without considering the network 
externality effect HI, [9]. Customers under the signal scheme have higher utilities than under the learning 
scheme since some customers may receive different signals and thus make different decisions. The random 
scheme performs better than the signal and learning schemes in the simulations since it splits 
the crowd evenly in probability and therefore prevents the severe quality degradation from the negative network 
externality. However, it does not perform better than the myopic and best response schemes since both of 
these schemes jointly consider the network externality and the social learning. 

Fig. 7. Deal Selection in Groupon under Different Schemes 
D. Deal Pricing 

Finally, we would like to study, through the Chinese restaurant game, how a new restaurant should price its 
deal to promote itself and maximize its revenue given other existing restaurants. Let us 
consider a new restaurant that is trying to enter the market. The owner would like to promote his restaurant 
by putting a new deal on Groupon at a discounted price. The restaurant's quality is not known 
by the customers before opening, but the restaurant may do some advertising or invite reviewers to 
give reviews on the restaurant's quality. These become signals to potential customers about the quality 
of the restaurant. Note that both methods can be controlled by the restaurant, i.e., the signal quality can 
be controlled by the new restaurant. Therefore, the owner needs to not only determine the deal price but 
also control the signal quality. Depending on the true quality of this new restaurant, the optimal deal 
price and the signal quality can be determined through the Chinese restaurant game. 

Here, we provide a numerical analysis on how a restaurant should determine the deal price and the 
signal quality. We consider a deal-offering website with two deals, one from an existing restaurant A 
and one from a new restaurant B. We assume that there are two types of restaurants: the high-quality 
restaurant with $Q_h = 25$ and the low-quality restaurant with $Q_l = 10$. The quality of restaurant A 
is $Q_h$, which is already known by all customers. Nevertheless, the new restaurant's quality is unknown to 
all customers. The probabilities that this new restaurant's quality is high or low are equal. Conditioning 
on the new restaurant's quality, a customer receives a positive review on the restaurant with a probability 
of $p$ if the restaurant's quality is high. Similarly, if the restaurant's quality is low, the probability that 
a customer receives a negative review on the restaurant is also $p$. Note that $p$ is controlled by the new 
restaurant as we discussed before. Suppose that the price of the deal offered by the existing restaurant is 
10. For the deal offered by the new restaurant, its price is denoted as $c$. All other settings are the same as 
in the previous simulations. We examine the expected revenue of the new restaurant through simulations. Let 
$p \in [0, 0.5]$ and $c \in [1, 20]$. The expected number of customers and the revenue given the quality of the new 
restaurant are shown in Fig. 8. 
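The pricing experiment amounts to a grid search over the price, with revenue taken as the price times the expected number of customers choosing the new restaurant. The sketch below illustrates that loop with a simplified myopic decision rule standing in for the best response of the Chinese restaurant game, so its numbers are not those of Fig. 8. It assumes R(Q) = Q, a uniform prior on the new restaurant's quality, and 0 < p < 1; all function names are hypothetical.

```python
import random


def expected_customers(c, p, new_is_high, n_customers=9, q_high=25.0, q_low=10.0,
                       d=2.0, price_a=10.0, runs=500, seed=0):
    """Average number of customers choosing new restaurant B under a myopic rule."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        diff = 0          # (#positive - #negative) reviews of B seen so far
        n_a = n_b = 0
        for _ in range(n_customers):
            # The customer's own review of B matches the true quality w.p. p.
            positive = (rng.random() < p) == new_is_high
            diff += 1 if positive else -1
            odds = (p / (1.0 - p)) ** diff
            g = odds / (1.0 + odds)   # belief that B is high quality
            # Myopic comparison: count only the customers already present plus himself.
            value_b = g * q_high + (1.0 - g) * q_low - d * (n_b + 1) - c
            value_a = q_high - d * (n_a + 1) - price_a
            if value_b > value_a:
                n_b += 1
            else:
                n_a += 1
        total += n_b
    return total / runs


def best_price(p, new_is_high, prices=range(1, 21)):
    """Grid search for the price maximizing revenue = price * expected customers."""
    return max(prices, key=lambda c: c * expected_customers(c, p, new_is_high))


print(best_price(p=0.7, new_is_high=True), best_price(p=0.7, new_is_high=False))
```

Replacing the myopic rule with the recursively computed best response would follow the same outer loop; only the inner choice rule changes.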

From Fig. 8(a) and 8(b), we can see that the number of customers choosing the new restaurant always 
increases as the deal price decreases, which is intuitive since a lower price means a higher utility to all 
customers regardless of the restaurant's quality. The signal quality, on the other hand, has a contradictory 
effect on the restaurant. We can see that when the signal quality increases, the number of customers 
choosing the new restaurant increases if and only if it is high-quality. Otherwise, the number of customers 
decreases when the signal quality increases. This is because when the signal quality is high, customers 
are more likely to identify the true quality of the restaurants. If the restaurant is high-quality, they are 
more willing to choose the new deal. However, if the restaurant is low-quality, they would rather avoid it. 
This suggests that if the restaurant is high-quality, the owner should try to increase the signal quality as 
much as possible by advertising or providing reviews. On the other hand, if the restaurant is low-quality, 
the owner may provide no information about the quality of this new restaurant. 

Fig. 8. Influence of Deal Price and Signal Quality on the Number of Customers and Revenue of New Restaurant 

Considering the revenue of the restaurant, as shown in Fig. 8(c) and 8(d), the optimal deal prices are 
different for low- and high-quality restaurants. From the simulations, we can see that the optimal price for 
the high-quality restaurant is higher than that for the low-quality restaurant. This is because even if the deal 
price is higher, customers who identify the true quality of the restaurant are still willing to choose this 
restaurant for its higher-quality meals. We also observe that a higher signal quality increases the revenue of the 
high-quality restaurant but decreases the revenue of the low-quality restaurant. Therefore, a high-quality 
new restaurant should try to improve the signal quality in order to help potential customers identify its 
true quality. On the contrary, a low-quality new restaurant may try to hide information about its quality and 
offer a lower deal price in order to attract more customers and thus obtain a higher revenue. 

VI. Conclusion 

In Part I of this two-part paper [T], we proposed a new game, called the Chinese restaurant game, 
to analyze the social learning problem with negative network externality. In Part II of this two-part 
paper, we illustrate three specific applications of the Chinese restaurant game in wireless networking, cloud 
computing, and online social networking. In the spectrum access problem in wireless networking, we 
show that the overall channel utilization can be improved by taking the negative network externality into 
account in the secondary users' decision process. The interference from secondary users to the primary user 
can also be reduced through learning from the sensing results of other users. In the storage platform 
selection problem in cloud computing, we show that customers automatically balance the load of the 
two platforms according to their knowledge of the platforms' infrastructures. The average load is 
optimized under the best response strategies derived in the Chinese restaurant game. In the deal selection 
problem on Groupon in online social networking, we show that the severe quality degradation caused by 
over-promotion through deals arises under the traditional social learning strategy, but not in the proposed 
Chinese restaurant game, where customers can find a balance among several deals by taking the negative 
network externality into account. Moreover, we study how a new restaurant should strategically determine 
its deal price and advertising effort in order to maximize its revenue. When the restaurant's quality is 
high, the owner should advertise the restaurant as much as possible to convince customers to come for 
the high-quality meals, while if the restaurant's quality is low, the owner should avoid advertising and offer 
a lower deal price to attract customers with little knowledge of the restaurant. 